Uniform Deviation Bounds for Unbounded Loss Functions like k-Means
Authors
Abstract
Uniform deviation bounds limit the difference between a model's expected loss and its loss on an empirical sample uniformly for all models in a learning problem. As such, they are a critical component of empirical risk minimization. In this paper, we provide a novel framework to obtain uniform deviation bounds for loss functions which are unbounded. In our main application, this allows us to obtain bounds for k-Means clustering under weak assumptions on the underlying distribution. If the fourth moment is bounded, we prove a rate of O(m^{-1/2}) compared to the previously known O(m^{-1/4}) rate. Furthermore, we show that the rate also depends on the kurtosis — the normalized fourth moment which measures the "tailedness" of a distribution. We further provide improved rates under progressively stronger assumptions, namely, bounded higher moments, subgaussianity and bounded support.
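As a quick illustration of the kurtosis quantity the abstract refers to — the normalized fourth moment E[(X − μ)⁴] / σ⁴ — here is a minimal sketch (not from the paper; the function name and sample distributions are illustrative choices) that estimates it for a light-tailed and a heavier-tailed sample:

```python
import random
import statistics

def kurtosis(xs):
    """Empirical normalized fourth moment E[(X - mu)^4] / sigma^4.

    Roughly 3 for Gaussian data; larger values indicate heavier tails.
    """
    mu = statistics.fmean(xs)
    m2 = statistics.fmean([(x - mu) ** 2 for x in xs])  # variance
    m4 = statistics.fmean([(x - mu) ** 4 for x in xs])  # fourth central moment
    return m4 / m2 ** 2

random.seed(0)
light = [random.gauss(0, 1) for _ in range(100_000)]        # kurtosis near 3
heavy = [random.gauss(0, 1) ** 3 for _ in range(100_000)]   # much heavier tails
print(kurtosis(light), kurtosis(heavy))
```

In the paper's terms, the heavier-tailed sample is exactly the kind of distribution for which the deviation bound degrades, even though both samples have all moments finite.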
Similar Resources
Uniform Deviation Bounds for k-Means Clustering
Uniform deviation bounds limit the difference between a model’s expected loss and its loss on a random sample uniformly for all models in a learning problem. In this paper, we provide a novel framework to obtain uniform deviation bounds for unbounded loss functions. As a result, we obtain competitive uniform deviation bounds for k-Means clustering under weak assumptions on the underlying distri...
Relative Deviation Learning Bounds and Generalization with Unbounded Loss Functions
We present an extensive analysis of relative deviation bounds, including detailed proofs of twosided inequalities and their implications. We also give detailed proofs of two-sided generalization bounds that hold in the general case of unbounded loss functions, under the assumption that a moment of the loss is bounded. These bounds are useful in the analysis of importance weighting and other lea...
Moment-based Uniform Deviation Bounds for k-means and Friends
Suppose k centers are fit to m points by heuristically minimizing the k-means cost; what is the corresponding fit over the source distribution? This question is resolved here for distributions with p ≥ 4 bounded moments; in particular, the difference between the sample cost and distribution cost decays with m and p as m^{min{-1/4, -1/2+2/p}}. The essential technical contribution is a mechanism to u...
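The rate m^{min{-1/4, -1/2+2/p}} quoted above can be evaluated directly. This small sketch (illustrative only; the helper name is mine) computes the exponent exactly for a few values of p, showing it stays at -1/4 for 4 ≤ p ≤ 8 and approaches the -1/2 rate as more moments are assumed bounded:

```python
from fractions import Fraction

def rate_exponent(p):
    """Exponent of m in the deviation bound m^{min{-1/4, -1/2 + 2/p}},
    for a distribution with p >= 4 bounded moments; exact fraction."""
    return min(Fraction(-1, 4), Fraction(-1, 2) + Fraction(2, p))

for p in (4, 8, 16, 100):
    print(p, rate_exponent(p))
```

Only beyond p = 8 does the second branch of the min become active, so the extra moment assumptions start paying off in the rate at that point.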
Tight Lower Bound on the Probability of a Binomial Exceeding its Expectation
We give the proof of a tight lower bound on the probability that a binomial random variable exceeds its expected value. The inequality plays an important role in a variety of contexts, including the analysis of relative deviation bounds in learning theory and generalization bounds for unbounded loss functions.
Coefficient Bounds for Analytic bi-Bazilevič Functions Related to Shell-like Curves Connected with Fibonacci Numbers
In this paper, we define and investigate a new class of bi-Bazilevič functions related to shell-like curves connected with Fibonacci numbers. Furthermore, we find estimates of the first two coefficients of functions belonging to this class. Also, we give the Fekete-Szegő inequality for this function class.
Journal: CoRR
Volume: abs/1702.08249
Pages: -
Publication date: 2017